Abstract

Synchronous Tree Adjoining Grammars 
can be used for Machine Translation. However,
translating a free word order language such as
Korean into English is complicated. I
present a mechanism to translate scram- 
bled Korean sentences into English by com- 
bining the concepts of Multi-Component 
TAGs (MC-TAGs) and Synchronous TAGs 
(STAGs). 



1 Motivation 

Tree Adjoining Grammars (TAGs) were first devel- 
oped by Joshi, Levy, and Takahashi (Joshi et al., 
1975). There are other variants of TAGs such as 
STAGs (Shieber and Schabes, 1990), and MC-TAGs 
(Weir, 1988). STAGs in particular can be used for 
machine translation and were applied to Korean- 
English machine translation in a military message 
domain (Palmer et al., 1995). 

Park (Park, 1995) suggested a way of handling 
Korean scrambling using MC-TAGs together with a 
priority concept. However, as scrambled argument 
structures in Korean were represented as sets using 
MC-TAGs, a mechanism to combine MC-TAGs and 
STAGs was necessary to translate Korean scrambled 
sentences into English. 

2 Korean-English Machine 
Translation Using STAGs 

STAGs are a variant of TAGs introduced to charac- 
terize correspondences between tree adjoining lan- 
guages. They can be used to relate TAGs for two dif- 
ferent languages for machine translation (Abeille et 
al., 1990). The translation process consists of three
steps. First, the source sentence is parsed according to the
source grammar. Each elementary tree in the deriva- 
tion is considered with the features given from the 
derivation through unification. Second, the source 
derivation tree is transferred to a target derivation. 
This step maps each elementary tree in the source 
derivation tree to a tree in the target derivation tree
by looking in the transfer lexicon. And finally, the
target sentence is generated from the target deriva- 
tion tree obtained in the previous step. 

The transfer lexicon consists of pairs of trees, one 
from the source language and the other from the 
target language. Within the pair of trees, nodes may 
be linked. Whenever adjunction or substitution is 
performed on a linked node in a source tree, the 
corresponding operation applies to the linked node 
in the target tree. 



Figure 1: The K-E Transfer Lexicon (pairs α: Tom, β: Jerry, γ: ccossnunta / chases)



Canonical ordering of the arguments of transitive
verbs in Korean is SOV. Whereas the case marker
in English is implicit in the word, case markers are
explicit in Korean. This is reflected in the transfer
lexicon of Figure 1. The pair α in Figure 1 shows
that Korean has an explicit subject case marker i,
and the pair β shows that Korean has an explicit
object case marker lul. Also, the pair γ shows the links
between the SOV structure of Korean and the SVO
structure of English.



(1) K: Tom-i    Jerry-lul  ccossnunta.
       Tom-NOM  Jerry-ACC  chase
    E: Tom chases Jerry.

To translate sentence (1), we start with the pair γ
in Figure 1, and we substitute the pair α on the link
from the Korean node SP to the English node NP.
Then, the pair β is substituted at the linked OP and NP
nodes in γ, thus correctly transferring sentence (1).




3 Handling of Scrambling in Korean 
Using MC-TAGs 

TAGs and related formalisms, due to the extended 
domain of locality, can combine a lexical head and all 
of its arguments in a single elementary structure of 
the grammar. However, Becker and Rambow show 
that TAGs that obey the co-occurrence constraint 
cannot handle the full range of scrambled sentences 
(Becker and Rambow, 1990). As a result, non-local 
MC-TAG-DL (Multi-Component TAG with Dom- 
inance Link) was proposed as a way of handling 
scrambling¹. Later, by adding a priority concept
to MC-TAG-DL, Park (Park, 1995) suggested a way 
of handling scrambling in Korean. 

3.1 αARG & βARG structures



[Figure: the two Korean structures for the subject Tom]



For handling scrambling, the multi-adjunction
concept in MC-TAGs can be used for combining a
scrambled argument and its landing site. For example,
a subject (e.g., Tom) would have two Korean
structures as above. For notational convenience,
call the two structures αARG_SP and βARG_SP,
respectively. In general, αARG represents a canonical
NP structure and βARG represents a scrambled NP
structure. βARG_SP shows a pair of structures for
representing the scrambled subject argument. Call
the left structure of βARG_SP βARG^s_SP and the
right structure βARG^π_SP. βARG^s_SP represents a
scrambled subject, and βARG^π_SP is used for
representing the place where the subject would have been
in the canonical sentence. Similarly, βARG_OP
denotes a pair of structures for representing a
scrambled object argument.

The basic idea is that whenever an argument is
not in a scrambled position, it should be substituted
into an available empty slot using the αARG structure.
The βARG structure will be used only when
the argument is in a scrambled position so that the
αARG structure cannot be used.



3.2 An Example

(2) K: Jerry-lul  Tom-i    ccossnunta.
       Jerry-ACC  Tom-NOM  chase-DECL
    E: Tom chases Jerry.



From the elementary trees in Figure 2, both sentences
(1) and (2) can be derived. For example,
Figures 2(a), 2(b), and 2(d) can be used for sentence
(1), to derive Figure 3(a). However, for sentence
(2), where the order is OSV (the object argument is
scrambled), Figures 2(a), 2(c), and 2(d) are used to
derive Figure 3(b) (βARG^s_OP is adjoined onto S, and
βARG^π_OP is substituted into the OP node). As the
trace feature is locally set within each βARG structure,
the two OP nodes in Figure 3(b) are co-referenced
with the same variable, <1>, indicating where the
object should have been in the canonical sentence.

Figure 2: Elementary Trees (panels (a), (b), (c) βARG_OP, and (d))




Figure 3: Derived Trees ((a) Canonical, (b) Scrambled)



Each elementary tree is given a priority. A higher
priority is given to the αARG structure over βARG.
Generally, when a structure given a higher priority
over others can be successfully used for the final
derivation of a sentence, the remaining structures
will not be tried at all. Only when the highest
priority structure fails will the next available structure
be tried².
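
The priority scheme can be illustrated with a small Python sketch. It is a deliberately simplified model, not the parser described here: it assumes a single clause with two argument slots, assumes that arguments using the scrambled structure are adjoined in front of the clause, and ranks candidate analyses by how many scrambled structures they use. The names `derives` and `first_derivation` and the labels `"alpha"` and `"beta"` are hypothetical.

```python
from itertools import product

CANONICAL = ["SP", "OP"]  # canonical (SOV) order of the argument slots


def derives(order, choice):
    """Check whether the observed argument order fits a given choice.

    Assumption: arguments using the scrambled structure ("beta") are
    adjoined in front of the clause, while arguments using the canonical
    structure ("alpha") fill their slots in canonical order.
    """
    scrambled = [arg for arg in order if choice[arg] == "beta"]
    in_place = [arg for arg in CANONICAL if choice[arg] == "alpha"]
    return order == scrambled + in_place


def first_derivation(order):
    """Return the first successful choice, trying canonical structures first."""
    candidates = sorted(
        product(["alpha", "beta"], repeat=len(CANONICAL)),
        key=lambda c: c.count("beta"),  # fewer scrambled structures = higher priority
    )
    for candidate in candidates:
        choice = dict(zip(CANONICAL, candidate))
        if derives(order, choice):
            return choice  # lower-priority candidates are never tried
    return None


print(first_derivation(["SP", "OP"]))  # sentence (1)
print(first_derivation(["OP", "SP"]))  # sentence (2)
```

For the SOV order of sentence (1) the all-canonical choice succeeds immediately, so no scrambled structure is considered. For the OSV order of sentence (2) that choice fails, and the first candidate that succeeds uses the scrambled structure for the object only.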

4 Using MC-TAGs in STAGs 

For mapping Korean to English, the simple object
(NP) structure of English (e.g., the right structure of
the β pair in Figure 1) can be mapped to two structures,
i.e., αARG_OP and βARG_OP, thus generating two
possible lexical pairs.



¹ An additional constraint system called dominance
links was added, thus giving rise to MC-TAG-DL.



² As a way of implementing a verb-final condition in
Korean, the βARG^π_SP structure is dominated by βARG^s_SP,
and each S-type verb elementary tree will have an HA
constraint on the root node, which guarantees that a
βARG type structure cannot be adjoined onto the partially
derived tree unless its predicate structure (its S-type
verb elementary tree) is already part of the partially
derived tree up to that point. An example including
long-distance scrambling is shown in (Park, 1995).



For translating sentence (1), the αARG_OP-NP
pair is used for Jerry (similar to the β pair in Figure
1). However, in sentence (2), the βARG_OP-NP pair
should be used instead for translating the scrambled
argument Jerry (i.e., Figure 4(a)). Thus, it is necessary
that a Korean βARG structure (MC-TAG)
be mapped to an English NP structure (TAG) to
transfer a scrambled argument in Korean. I assume
that there is one head structure for each MC-TAG
structure, and that βARG^π (the place holder structure)
is the head structure for each βARG structure.
The root node of the head structure is always
mapped to the root node of the target (English)
structure.

Usually, in STAGs, each node in the source language
should be linked to the relevant node in the target
language, and vice versa. However, in a
multi-component structure (e.g.,
βARG), an adjunction node need not necessarily
be linked to any node. If it is not linked to any 
node of the target language, the structure can be 
freely adjoined onto any available node of the par- 
tially derived tree of the source language, which is 
approximately what scrambling is about. However, 
substitution nodes will always be linked (the differ- 
ence between a substitution node and an adjunction 
node is that an adjunction node does not introduce 
a new structure to the partially derived tree whereas 
a substitution node always does). 




Figure 4: K-E Transfer Lexicon and Derived Tree ((a) K-E Lexicon, (b) K-E Derived Trees After Applying (a))

In Figure 4(a), the root node NP of the English
TAG is mapped to the OP node of βARG^π_OP of
the Korean TAG, which is the head structure. All
the other nodes are mapped to each relevant node
except Sj. As it is not linked, βARG^s_OP can be
adjoined onto any available node in the partially
derived Korean tree. Actually, the restriction on
whether βARG^s_OP can be adjoined onto a certain
node does not come from the formalism of Synchronous
TAGs, but purely from the grammar of
Korean TAGs. Figure 4(b) shows the final derived
trees for both Korean and English after applying
4(a) to the partially derived trees.
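
The effect of leaving the adjunction unlinked can be sketched in Python. The sketch below is a simplification under stated assumptions: a Korean derivation is reduced to a list of (attachment node, structure) steps on the verb tree, the adjunction of the scrambled component is recorded at a node `S` that has no link, and all structure names, `VERB_LINKS`, `HEAD_TO_ENGLISH`, `transfer`, and `generate` are hypothetical names chosen for the example.

```python
# Links of the verb pair (ccossnunta / chases). The adjunction site "S"
# is deliberately absent: it has no counterpart on the English side.
VERB_LINKS = {"SP": "NP_subj", "OP": "NP_obj"}

# Each canonical structure, and the head (place holder) component of each
# scrambled structure, maps to an English NP tree.
HEAD_TO_ENGLISH = {
    "alphaARG_SP(Tom)": "Tom",
    "alphaARG_OP(Jerry)": "Jerry",
    "betaARG_OP.head(Jerry)": "Jerry",
}


def transfer(korean_steps):
    """Map Korean derivation steps to English ones, skipping unlinked steps."""
    english = {}
    for node, structure in korean_steps:
        if node not in VERB_LINKS:
            continue  # unlinked adjunction: nothing happens on the English side
        english[VERB_LINKS[node]] = HEAD_TO_ENGLISH[structure]
    return english


def generate(english):
    """Linearize the English derivation in SVO order."""
    return f'{english["NP_subj"]} chases {english["NP_obj"]}'


canonical = [("SP", "alphaARG_SP(Tom)"), ("OP", "alphaARG_OP(Jerry)")]
scrambled = [
    ("S", "betaARG_OP.scrambled(Jerry)"),  # adjoined, not linked
    ("SP", "alphaARG_SP(Tom)"),
    ("OP", "betaARG_OP.head(Jerry)"),      # head structure, linked to NP
]

print(generate(transfer(canonical)))  # Tom chases Jerry
print(generate(transfer(scrambled)))  # Tom chases Jerry
```

In this model the canonical and the scrambled Korean derivations transfer to the same English derivation, because only the linked head structure contributes to the target side.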

5 Conclusion and Future Direction 

Using MC-TAGs allows the scrambled argument 
structure to be represented as a single (set) struc- 
ture. This makes possible the mapping of Korean 
scrambled argument structures into English argu- 
ment structures. The application of similar mech- 
anisms for other languages and for mapping quasi 
logical forms to logical forms (Alshawi et al., 1992) 
using STAGs is also being investigated. 